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ABSTRACT 


A sequential search procedure for maximization of a single variable multimodal 
objective function is designed and investigated in this research. Existing sequential 
procedures require the function to be unimodal. Nonsequential methods, though not 
restricted in this sense, require a large number of samples. Results show that the pro- 


posed sequential method is in this case preferable. 
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Il, INTRODUCTION 


A. PURPOSE 

The purpose of this research is to design a sequential search procedure capable 
of estimating and locating the global maximum of a single variable, multimodal 
objective function with a predetermined accuracy. The procedure will be derived 
in the form of an algorithm such that it can be easily implemented by a digital 
computer. 
B. LITERATURE SURVEY 

A maximization problem for a digital computer provides an algorithm by which 
the objective function is evaluated or sampled for various values of its argument 
(sampling points). Such a procedure must prescribe a rule to choose the sampling 
points (to be called the sample rule) and a function (to be called the estimate) 
depending, in general, on all samples obtained and approximating the unknown value 
of the maximum, Shubert and Spang [Refs. 10 and 11]. The sampling rule divides 
these procedures into two general categories: sequential and nonsequential. Ina 
sequential procedure the sampling rule utilizes the values of previous samples to 
determine the location of the next sampling point. The procedure is called non- 
sequential if the sampling points are chosen, possibly by some random mechanism, 
in advance of any computation or experimentation. 

Hill, Spang and Wilde [Refs. 4, 11, and 13] have further subdivided these pro- 
cedures into three basic groups: (1) gradient methods; (2) sequential min-max methods; 
and (3) random and grid search methods. 

1. Gradient Methods 

The largest body of material in the literature is concerned with what has 
been referred to as "gradient methods" of maximization. These methods utilize the 
"hill-climbing" principle to determine the direction in which the objective function 
increases, i.e., measurements of the slope of the function are used as an indication 
of the direction toward the maximum. A general approach to gradient methods is 
found in Spang [Ref. 11]. The first sampling point is selected arbitrarily and depends 
primarily on the experimenter's subjective opinion as to the location of the maximum. 
If the objective function is such that the approximate location of the maximum cannot 


be determined in a subjective manner, the midpoint of the experimental region may be 


}] 





used as the "initial guess", Brooks [Ref. 2], where the experimental region is 
defined as the interval containing all possible sampling points for which the ob- 
lective function is defined. Computing time will be significantly reduced the 
closer the initial sampling point is to the true abscissa of the maximum. The 


remaining sampling points are determined by the iterative equation: 


aon, (1) 


where X , is the value of the sampling point at the n th iteration, he iS a positive 
constant, and D is the direction vector evaluated at the n th iteration. Fora 
single variable objective function, D_ is, of course, a scalar, Thus, if D_ is posi- 
tive the (n + 1) st sampling point will be to the right of the n th sampling point, 
the reverse will hold if D. is negative. The magnitude of he dD determines how 
large a step is taken in the direction specified by bee The iterations continue 


until 


h Disc (2) 


where ¢ > 0 is the desired accuracy in estimating the true location of the maximum. 
The inequality given by (2) is considered the stopping rule for this method. If the 
inequality is satisfied then x, is the estimate for the abscissa of the maximum and 
f(x) becomes the estimate for the maximum. If the inequality is not satisfied, then 
the procedure goes to (1). Although the various gradient methods differ in their 
choice of scale factor he and direction vector Ds the general approach to the 
problem is the same. 


For example, Spang [Ref. 11] uses as a choice of the direction vector dD 


df 
= ' (3) 
Ae 
where (3) is the value of the derivative at the nth sampling point. The sample values 
+] 


of the (n + 1) st and nth sampling points are used to choose the value of he in 


the following manner: 


Fe 7) >f & ) then ee = 


F(x 4) s f(x) then h- =: 


; 
Nn 
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eealae 
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2. Sequential Min-Max Methods 


Like gradient methods, sequential min-max methods operate under the 
assumption that the objective function is unimodal in the experimental region. In 
general, these procedures reduce the range of the independent variable by a pre- 
determined amount. Hence, it is possible to determine, before taking any samples, 
the number of iterations required to reduce the range of the independent variable a 
given amount. The method developed by Kiefer [Ref. 7], based on the Fibonacci 
numbers, is the most popular of all min-max methods. No other method can guaran- 
tee a shorter interval of uncertainty for all functions of the class considered (uni- 
modal functions defined in the experimental region [a,b].) Another related method 
can be found in Berman [Ref. 1]. Spang and Wilde [Refs. 11 and 13] outline the 
general approach discovered by Kiefer. 

This approach assumes that the maximura of the objective function initially lies 
within an interval aq, b as illustrated by Fig. 1. If two sampling points are selec- 
ted within this interval, say x " andx", where n is the iteration number and such 


| 2 


n rere ; : 
that x. <x 2! it is obvious that if 


I 


f (x")) > f (x",) then the maximum lies betweena , * " 


2 

fix.) fx") the th lies bet xb 
( < 9 n the maximum ites between De 
f (x")) = f (x'5) then the maximum lies between x, ae 


Whenever the equality condition occurs, either the interval |[ aq, xy ore | ae DJ 
is selected to maintain mathematical symmetry. Thus, by this simple test the size 
of the interval guaranteed to contain the maximum can be reduced. 

The sampling rule is based on the total number of tests, N, that must be 
performed. N can be determined once the experimenter has specified the desired 
accuracy in estimating the true location of the maximum. The length of the final 


interval, b -a =6 specifies the desired accuracy. 
N N WN 


Kiefer [Ref. 7] has shown that after N iterations 


ee a= o), (4) 





| 
a n n 

n x 
l Z 


Figure 1. Illustration for sequential min-max method 
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0 ~ % ) is the length 


of the original interval, see also Spang [Ref. 11]. Solving (4) for Uy the value 


where U_ is the value of the N th Fibonacci number and (b 
n 


of the N th Fibonacci number is determined. Using the table of Fibonacci numbers, 
the corresponding N, which is the total number of tests required to attain the de- 
sired accuracy Ay can be found. 


With N established the algorithm proceeds in the following manner: 


en NET <n (b -a )t+a 
a n on n 
Nt+t-n 
U 
se eae (GOs. = Can) et oc 
U a aa n 
N+1-n 
Pex f(x. ) set co = e = x, 
; x xX») seta | a,b oy 9 
n at n 
= b = 
f(x,)< F(x, ) set qy Xo PD a b 
ak.» n _ — fi 
F(x) ) = (x, ) set qd a, bd Xn 
or set qt yr By b 
3. n=N, set 6.) = b -a 


feeeINE| go TO |. 


Either f(a, )or f ( by ) could be used as an estimate of the maximum. 


N? 


For large values of n the ratios of the Fibonacci numbers given in step | of 


the algorithm approach a constant: 


ee 
= me Onse2 


S| 





and 


U, 
Peo LS 
CF a 





; eee: ae n 
Therefore the following approximation formulas can be used to obtain xy and Xo, 


forme |, ss, N: 





MeeeOneS2 (6 =a ) +a 


1 n n n 


m= O61 0b -a )r a. 
Pi n n n 


Xx 


see Spang [Ref. 11]. 
3. Disadvantages of Gradient and Sequential Min-Max Methods 

The major drawback of these methods is that they are successful only if 
the objective function is unimodal, i.e., has only one hump in the experimental 
region. If the objective function does not satisfy the unimodality requirement, 
these methods will be successful in reaching a local maximum at the best. This 
is obvious because the methods are based on the "hill-climbing" principle of 
moving the next sampling points in the direction in which the function increases. 
Since in many practical problems it is not possible to guarantee unimodality of the 
objective function, it is important to develop a search technique which finds the 
maximum of a multimodal objective function. Furthermore, gradient methods 
usually require further regularity conditions such as the existence of the first 
and second derivatives. 


4. Random Search Methods 


Unlike gradient and sequential min-max procedures, random search pro- 
cedures are not restricted to unimodal functions. These methods are nonsequential 
in that the previous sample values do not determine exactly where the next sampling 
point will be located. Various purely random methods can be found in Brooks, 
Karnopp, Spang and Zachharov [Refs. 3, 6, 11, and 14], 

The general procedure is to select the sampling points at random in the 
area where the maximum is located according to a fixed distribution. After a 
certain number of iterations have been performed, the largest value of the objec- 
tive function is considered to be the estimate of the moximum. Assuming that the 
maximum is equally likely to occur anywhere in the experimental region, [ a, b ], 
let (b -a) = d be the length of this interval. With mo prior knowledge con- 
cerning the location of the maximum, it is reasonable to use a flat density function 
over the interval [ a, b ]. A priori, the experimenter specifies the accuracy he 


desires in estimating the location of the maximum as Bey where bY is the largest 
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interval of uncertainty he is willing to accept after p iterations. The interval 


[a,b] is further divided inte N subintervals each of length 6 The ratio of a 


N° 
subinterval to the original interval is 


The probability that a sampling point is not in a particular interval is (1-g ) 
and the probability that it is still not in this interval after p trials is (1 -g ) i 
Thus, the probability of at least one of the sampling points being in this subinterval 


1S 
s= 1-(1-g)?P. (5) 


s can also be considered as the confidence level of one of the sampling points being 
in a specified interval. Solving (5) for p, the required number of sampling points 
can be determined as 
Be-1 Onl aiease) (6) 
log (1-g) 
Brooks [Ref. 3] and Spang [Ref. 11] tabulate the number of iterations required for 
various values of g ands. It can be seen from these tables that the number of sam- 
pling points (iterations) required increase quite rapidly with a reduction in g. 
Suppose x. is the value of the sampling point at the i th iteration, where 
i runs from 1 to p, p determined by (6); R. is the value of a random oo between 
OQ and 1 selected from a uniform distribution for the i th iteration; and f. is the esti- 


mate of the global maximum at the i th iteration; then the algorithm follows: 


aoe | 
2. Compute 


x, =a+Rd 

(Se 

] 1 
3. Sefiz=it] 

Compute 

x =a +Rd 

A A 

in = max {f._ 1, F(x )} 


ee 





CG * e A 
4. i=p, estimate the global maximum to be f atx . 


m5, goto 3. 
5. Grid Search Methods 


Grid search procedures are systematic in the sense that the sampling 
points are equally spaced a predetermined distance apart in the experimental 
region. The sample values for all sampling points are obtained and that which is 
the largest is considered the estimate of the maximum. 

Like random procedures and unlike gradient and sequential min-max 
procedures, these methods are successful in estimating the global maximum and 
its location for a multimodal objective function over the experimental region. 
These methods are also nonsequential. A method for grid search can be found in 
Spang [Ref. 11]. 

In general, the experimental region [ a, b ] is subdivided into N intervals 
is the accuracy the experimenter is willing to accept in 


of length 6, _, where & 


N N 


estimating the location of the global maximum for the objective function in [a, b ]. 


The number of sampling points, p, required by this procedure can easily be computed 


as 
d 
Dp. = re , (7) 
where d is the length of the original interval. This is about half as many iterations 
as are required by purely random methods to attain the same accuracy. 


Suppose x, is the value of the sampling point at the i th iteration, where 


cae ae : 
i runs from 1 to p, p determined by (7); By ' the length of the equidistant inter- 
A 
vals; and f, is the estimate of the global maximum at the i th iteration; then the 
i 


algorithm follows: 


l. Seti = ] 
2. Compute 
x =g+ o 
1 
oN 
f =f 
ee 





3. Seti=it] 


Compute 

x, = xX, +o 

i- | 5 
N 
fh = Pog 
ery ss)! 
A 
4, i =p, estimate global maximum to be a Ginx. 

ip, goto). 


6. Disadvantages of Random and Grid Search Methods 


The major drawback to random procedures is that the maximum is found 
only with some probability as long as the number of samples is finite. Furthermore, 
being nonsequential, random methods as well as grid search procedures require a 
very large number of samples to estimate the maximum and its location with rea- 
sonable confidence level and residual uncertainty. Although grid search procedures 
require about half the number of sample points required by random methods to 
attain the same accuracy, the number of samples required is still too large to be 
practical in many situations. 

7. Methods Combining Gradient and Nonsequential Search Procedures 

Several attempts have been made to combine the "hill-climbing" 
principle with nonsequential search to maximize a multimodal function over the 
experimental region. For various methods utilizing these procedures see Hill, 
Hartley, Matyas, Pijavskii, Vaysbord and Yudin [Ref. 4, 5, 8, 9, and 12]. 

Basically two approaches are used in this type of search: (1) finite 
random or deterministic global search procedures are used to locate favorable 
starting points in the experimental region with gradient methods applied in the 
intervals specified by the starting points; and (2) gradient methods are applied 
in the current interval being searched and at some random time during the search 
of this interval the search goes to another randomly selected interval in the exper- 
imental region; Matyas, and Vaysbord and Yudin [Refs. 8 and 12] illustrate this 
approach. 

In approach (1) the experimental region [ a, b ] is divided into N sub- 


intervals by some finite random or deterministic method. Each of the intervals 


specified by this procedure are then searched by gradient methods. The largest 
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sample value obtained is the estimate of the global maximum. The drawback of 
this method is that it does not guarantee that the global maximum will eventually 
be found. 

Approach (2) uses gradient methods about the initial sampling point until 
a random trial moves the search to another interval. After each sample is observed 
this random trial is performed. If the procedure moves to another interval, still 
another random trial is used to determine the exact interval to be searched. The 
method continues until the stopping rule is satisfied at which time the estimate for 
the global maximum is considered to be the highest sample value attained. The 
drawback of this method is that the disadvantages of purely random methods pre- 
vail, namely, a very large number of iterations (samples) are required. 

8. Conclusions 

The methods discussed above are either too stringent on regularity condi- 
tions and the requirement that the function be unimodal in the experimental region 
or impractical from the standpoint of the number of iterations required. Further- 
more, none of these methods provide a truly sequential approach to the problem 


of multimodality. 
C. APPROACH OF THE METHOD TO BE CONSIDERED 


In this research the approach will be to solve the maximization problem of 
multimodality in terms of a sequential sampling rule which iseasily implemented. 
The formulation will restrict the class of admissable functions to be maximized to 
those of a single variable and which are globally Lipschitzian. !n addition it will 
be assumed that there are never any errors present in the observed values of the 
function. The assumption that the function be globally Lipschitzian means that 


there is some constant C, the value of which is known, such that 
t 


ee (8) 
|x -x | 


t e e 
for any x, x , x #x , in the experimental region [a, b]. If the function is dif- 


/ 
ferentiable, the value of C is usually not too difficult to compute. It amounts to 
finding some upperbound on the function's derivative. In the case where the 


function is given empirically, the constant C can often be obtained from the 
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physical nature of the function. If the exact form of the function is unknown, the 
selection of C will have to be based on the experimenter's subjective judgement. 
In any case, if it is desired to estimate the value or location of the maximum with 
a predetermined accuracy, knowledge of C or some equivalent information is 
necessary to determine a stopping rule regardless of the method used. This is 
obvious since if nothing of this sort is known or assumed about the function, no 
conclusion can be drawn about the estimation error from a finite number of samples. 
The sampling rule for the maximization method under consideration and the 
convergence of the method are discussed in Chapter Il. Chapter II! describes the 
formulation of the computerized algorithm based on the procedures specified by 
the sampling rule and the convergence criteria. It will be described in Chapter IV 
how the algorithm, slightly modified, can be used to estimate the zeros of a func- 
tion. Several sample problems, ranging in degree of difficulty, were selected 
and solved by the computerized algorithm resulting from this research in an attempt 
to test the method and as a basis for comparison with the solutions obtained by 
other methods. The sample problems and their solutions are discussed in Chapter V. 
An experiment was performed to test the algorithm's sensitivity to the Lipschitzian 
constant and the shape of the objective function. The experimental procedure, 
Pett tendicenclusions arerdiscussed in Chapter VI. The results and conclusions 
of this research are discussed in Chapters VII and VIII and the recommendations 
for future research are discussed in Chapter IX. Appendices A - F contain the 


flow charts and program listings for the method under consideration. 


2) 





Il. DEVELOPMENT OF THE SAMPLE RULE 
FOR THE MAXIMIZATION METHOD TO 
BE CONSIDERED 


Consider the sequence of sampling points Xo, Xprvee ye X tin La, b ] 


and their corresponding sample values { f (Xo Jeach (x, Jers (x emo (G) 


foes hax + Cl]x-x 


o | 


Be eX) C | x-x, | 
F(x) <s f(x ) + S| Rox | 
Define 
F (x) = min {f(x ) + C x-x ; 9 
in = Oe et occa Nn kK | k | 3 ” 
to be the piecewise linear function passing through the points ( Xo F ( Xo eae 


( x aes (x, 1 ae (x, f (x, ) ) with the slope determined by the 
Lipschitzian constant C, defined by (8). Figure 2 illustrates the function i 
defined by (9). 

From Fig. 2 it can be seen that for n samples the whole function f is upper- 
bounded by - with at most (n + 2 ) peaks (including possible peaks at the end- 
points of the interval [a, b] . ) 


Define 


Z = max fee 
n n 


10 
xe [a,b] = 
Clearly, Ze will be the ordinate of the highest peak of F Let 


co = max fox) 
X € [a,b] 


be the global maximum of the function f and let 


A = 
Oo, = max { F(xXQ), «+0, f(x )} (11) 


be the estimate of « after n iterations (samples) have been observed. 
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Figure 2. Graph of function 
F (x )=min (f(x, ) +c | x-x 
n Kee Opel meinen, 1 
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Since F_ upperbounds f and 2. is the largest sample value observed so far, the 
n 


following is concluded: 


IA 


A 
p so sZ. (12) 
It seems plausible at this point to utilize the difference between Ze and 2. as 


a basis for selecting the sample rule under consideration. Define 
ne (13) 


as the maximum possible error between « and 4 = after n observations. Furthermore, 
let et be the abscissa of Z Since by (12) the global maximum « is between 
Ze and eo it follows that as e. defined by (13) decreases, the global maximum 
can be more accurately determined. 

The method under consideration seeks that choice of x4] that will minimize 


e . €_ is the optimal selection in the minimax sense, in that any other choice of 
n n 


x could fail to decrease e by the same or larger amount. 
n 


+] 
, Since the above procedure is both sequential and optimal in a minimax sense, 
the sampling rule of selecting the abscissa of the highest peak in ia is used for the 
method under consideration. That the sampling rule for the method under consideration 
is in fact optimal relative to the class of all functions that are Lipschitzian is proved 


by Shubert [Ref. 10]. 


The sampling sequence for this method is defined mathematically as follows: 


50 € ied, b ] 
where Xo is the initial sampling point, 

x such that 

n+] 

pe ) = Zn=0, ies oF 
otherwise arbitrary, where 

Z = max BX), 

: xX € [a, b | . 
F (x)= min rece sta G| x=x |. } ss 
n Open : ‘ 
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The nature of the sampling sequence for this method suggests that as nx, 
a ; fer, «2 z Cae ec zs —» 0. Shubert [Ref. 10] theoretically proves that 
these conditions are satisfied for the sampling rule just considered. 

The rate at which the error, eo defined by (13) approaches zero is worthy of 
consideration at this point. Shubert [Ref. 10] has shown that the slowest possible 
rate at which e approaches zero occurs when f = constant, for any arbitrary 


selection of the initial sampling point Xo: lt remains to be seen how fast e 


approaches zero when the initial sampling point is 
(a+b) 
oO” 2 


X 


The speed of convergence in light of the conditions specified above will now be 


studied. 


Suppose 
y={x e¢ [a,b]: f(x) =e } (14) 


is the set of all x for which the global maxinium is attained. Furthermore, since the 
case is being considered for f = any constant, let the constant be zero for ease of 


illustration. Define C > O as the value of the Lipschitzian constant. By defini- 


tion f(x) = O for every x ¢ [a,b], hence, © ie O for all n and 
y= [a,b]. 
A further implication about eC. defined by (13) can be made since 
ee 
n n n 
and = 0 
n 


for all n then Je Ze for all n. 

Utilizing the function i defined by (9) and the sampling rule for the method 
under consideration, a general pattern for the rate at which the error e. decreases 
can be determined. Figure 3 illustrates the sampling sequence when f = constant. 

It can be seen from Fig. 3 that each sampling point after n = 2 increases the 


number of peaks in F by two and that the two new peaks created are equal in 


n+ | 
height. Furthermore, it can be easily shown that the height of these two new peaks 
is equal to half the height of the peak that created them, see Fig. 4. The calcula- 


tions illustrated in Fig. 4 are as follows: 
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[Hustration of sampling sequence when f 


Figure 3. 
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Zo=-2. = 7. Z=2=-2Z2 
n | | r n r 
Z Tie an 
o> i 5 
2 


Hence, 


Z 
[a fe 
2 


The results of the computations performed above are tabulated in Table |, for 
n= 16. The purpose of Table | is to outline the sampling sequence so that a general 
pattern of the way e | decreases can be determined. 

Let k be the length of the interval during which the value of Zs does not change. 
Then after two samples have been taken, exclusive of the initial one, it can be seen 
from Table | that k increases with powers of two while Ze decreases at the same rate. 


The above reasoning can be stated mathematically in the following manner : 


If Jae n < eo 
then me 
. ok + 1 
ok +1 San 
implies that 
l ] 
ak +1 nw! 
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Hence 


C(b-a) 


n 


Lae 
n 


and it follows that if f = constant, the error, en decreases at least as fast as 
C(b-a) 
n e 
Of course, the estimation error resulting from a nonsequential (random or grid) 
search would also decrease as — . However, as experimental results of Chapter V 
n 

indicate, the actual rate of the sequential algorithm is typically much faster. 

To locate the global maximum in the experimental region it is necessary to 


determine the set ¥ of all x ¢ [ a, b ] at which the global maximum is attained. 


Let 


¥ = {xe [a,b]: F (x)26 }, (15) 


n n n 
forn= 0, 1, ..., where Fe is the function defined by (9). Since 


A A 
ae nee 


Cet tx), 


n+] n 


for every x ¢ [ a, b ] it follows that 


and fx) aE 


yoy cy ,n=0, Nearer (16) 


n+] 
Figure 5. illustrates the heuristic approach to locating the global maximum co . 
Consider the situation after (n+ 1) samples have been observed. The heavy solid 


lines in Fig. 5 represent the uncertainty in the location of the maximum if 


A e e e 

O49 Z ? If o oe @, then the intervals of uncertainty remain the same 

and onl x ) changes. However, if 4 = f(x , then the estimate 
nd only F ( ) changes owever, if oy ( ae) n 


moves closer to «from below and as a result the intervals of uncertainty will be 


reduced in length and more accurately determine the abscissae for locating © . 


Clearly, Y is the smallest subset of [ a, b ] that defines the location of o, 


ee +] contains all elements of v but may contain several elements that do not lead 


to the location of o . Finally, the largest uncertainty set obtained from the sampling 


sequence is Y . Thus, 
n 
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Locating the global maximum. 


Figure 5. 
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Ve ie +1 ee Y ,n=Q0, |, ..., defined by (15). Furthermore, 
it has been shown by Shubert [Ref. 10] that the Lebesque measure of y - Y con- 
n 


verges fo zeroasn4o. 
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lll, DEVELOPMENT OF THE COMPUTERIZED ALGORITHM 


A. PRELIMINARY CONSIDERATIONS 

The sampling rule developed in Chapter II lends itself nicely to the formulation 
of a computerized algorithm for estimating the global maximum and its location of 
any deterministic function of a single real variable defined in a closed interval 
[a,b]. 

The information gathered from all samples obtained so far must be stored in the 
computer's memory so that it can make the proper decision as to where the next 
sampling point should be located. 

Define the data set stored in the computer's memory after n samples have been 
observed as 


DR Ginec ent) 725) +. Ho 7H ie, } 


n Nn 
n 


wafere 2. =< 2. < «es 


ee 2 


< Za , and the vectors (t., Zeejee = |, 2p. wes He are the 
n 


coordinates of the maxima of i defined by (9). 


Let 


be the initial sampling point and f (x, ) the corresponding sample value. Further - 


0 


more, 


iz Bre mac | ox 


o |: 


a 
Z, = f (xq) +C |b - x, | 

Clearly, ZO and Zz, are the two maxima of Fo (x ) defined by (9) at the endpoints of 

the experimental region [ a, b ]. Zz, = 4%, since (b “0 )=-(a- Xo ) implies 

that 2, = f(x ) -C (a - Xp ) = ae see FAQ:s oe C9 = f(x, ) is defined as 


the estimate of oafter the zero (th) iteration. Let i = a, and t= b be the abscis- 


sae of ZO and zy respectively. 
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Figure 6. Zeroth iteration. 





B. FIRST ITERATION 
Define F 


x ) as the piecewise linear function connecting the points (a, z_ ), 
a 


o | 


(Xo, f (Xp Jeb Zz) ) . For this iteration, maxima of F. (x ) consist of the 


0 
vectors (a,z_ be Zi yo Srniets z= 4, the arrangement of the vectors in 
i ; = pata 
Dy is completely arbitrary. Suppose ‘Ho iP en 
= 2 & 

Dy {(b,z),(a,z ); Cg i - 
By (10) 

ZZ = Z 

0 Ho a 


The corresponding sample value for x, = ais f(x ) = f(a). Drop the vector 


] 
( th 1 Zy )=, z from the data set D, and add the new vector (tf, Z ) 


0 0 : 
where 
z= wg (7+ F(a) ) 
a are =n Th) ) 


see Fig. 6. The new set of data D, is then obtained by rearranging the vectors 


l 


(ao» Zp ) and (ti, Z ) in the order of nondecreasing second component. It is 


obvious that 


Thus, 
- ae eS 
D,= { Wy z),(b, Zi ae 


where © | = max { ? 6 Pa Ge) feos D, is again the set of coordinates for all 


maxima of F, (x ) defined by the piecewise linear function connecting the points 


l 
(a, f(a) ), (ts Zz). (xX F(x_) ), (b, f(b) ), see Fig. 7. 
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First iteration. 


Figure 7, 
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C. SECOND ITERATION 


By (10) Zz, = Hi, =z and by the sampling rule Xn = Hy = b. The 
corresponding sample value for Xo = b is f (b) . Drop the vector (th Zi )= 
1 1 


ab, Zz ) from the data set D, and add the new vector (ty, Z,) where 


OS hl af 7 
Z iGO a ly 2G 


z 
| 
t = 


[ 2 
mo | 


see Fig. 8. Arrange the data set such that 


DeaGiecem( te 2. 1? ps } 
2 voy Hy He 2 
where Z1, = max f Zp, Zz io th the corresponding abscissa; 


2 


Z,=min { Zp, Z io t the corresponding abscissa; and 


| 


A A 
Po ee 4 


D. n (th) ITERATION 


Ronnie Sn, 


= , A 
dD. ee ty ty Jiro 3. 


n 
By (10), ZA H and by the sampling rule ee He The corresponding 
sample value for <a = H is f TH Drop the vector CH *H from 


the data set and add the two new vectors ( t 2 ay (t, Z ) where 


a a [z, + Fr di 72 
n n 
ant ~ [2,, eine! / 2 C 
n n n 
i tH? ae 7 28 


see Fig. 9. The new set of data, D , is obtained by rearranging the vectors 
n 


+ ] 


(te z1) psec, ty 1! “HL 4] et th, zy, i. (tf, Ze ) in the order of 
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Figure 8. Second iteration. 
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nth iteration. 


Figure 9, 
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nondecreasing second component. Thus, D is the set of coordinates for all the 
n 


+ ] 
e e A “AN 
maxima of a 4 1 defined by (9) and ey] > max { Qa F(t, ees 
n 


eee >!) OPPING RULE 


The iterations continue until 


where ¢ > 0 is the desired accuracy in estimating the global maximum ¢, and 3 
is the largest value for f observed so far, see Fig. 10. The experimenter should be 
realistic when he specifies ¢ . Naturally, he would like the error to be zero, how- 
ever, this would be feasible in a finite number of iterations only if the function f 
coincides with the function F in the area of the maximum. Since this is usually 
not the case, he must be willing to accept some error between o and 2 that will 
allow the computer to execute the algorithm in a reasonable amount of time. Cost 


will increase as more accuracy is desired. 


F. REDUCTION OF DATA 

The main computational disadvantage of the algorithm as described so far is that 
from the second iteration on the memory content increases by one vector ( t., Zz. ) at 
each iteration, | .e., H = n for n> 3. This can be remedied to some extent by 
dropping at each iteration all vectors (f., Z. de D. such that Zz. < 2. 4-17 5ee 
Fig. 11. Although the function - is no longer being stored completely, it is easy 
to see that the sequence of sampling points generated by the algorithm remains the 


same as before. It is clear from Fig. 1] that since F upperbounds f and ¢ 2 On +] 
n / 


the abscissae of all maxima of i below © a will never be sampled anyway, due 
to the very nature of the sampling procedure. The shape of the function in the ex- 
perimental region will determine how fast the memory content increases. It has been 
shown in Chapter II that in the most unfavorable case, f = constant, it will increase 


by one vector per iteration. 
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Figure 10. Stopping Rule 
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Figure 11. Reduction of data. 
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G. LOCATING THE GLOBAL MAXIMUM 
Define D , 93 the data set 


A 
On 


in the computer's memory after the final iteration of the algorithm as described so 
p Y g 


far. Furthermore, define F <¢ as the corresponding function of F Bs defined by (9). 


I 


D then contains coordinates for all maxima of F_ which upperbounds the function 
€ e 


mex), i.e.,f(x)s F (x ) by (9). Thus, the second components Z. of all 
A 


Tn +] 
after the last iteration. The information contained in D . provides many alterna- 


A — 
(t.,z.)e D are ¢~ close tom and are at least as great as = 
- | E € 


tives for estimating the location ‘/S) of » in [ a, b ], depending on the desires of 
the experimenter. Four of these alternatives will be examined below. 


1. Alternative | 


This alternative could be used if the experimenter is interested in a single 
estimate for the location of sin [a,b]. Only a slight modification to the estab- 
lished algorithm is necessary to obtain these results. Simply carry along the abscis- 
sa of the largest sample value obtained so far in the data set D. Thus, the revised 
data set at each iteration becomes 

A 
dD. = { (ti,z,), voor (Ty, zy )i (Xs i : 
n n n 
Ai the final iteration (x ; 2.) will be the coordinates of the estimated oand its 
location in[ a,b]. 

The major drawback to this alternative is that it fails to estimate the loca- 
tion of all global maxima in [ a, b ], however, it is relatively simple and particu- 
larly useful if the function is unimodal or when only a single estimate for the loca- 
tion of the maximum is desired. 


2. ALTERNATIVE II 


This alternative may be successful in estimating all locations of the global 
maximum in[ a,b]. D_ contains all information necessary to obtain these results. 
€ 
From the set of first components of (t,, z.) in D — , obtain a new data 
€ 


set 
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ae _A 
Cra L Ati yp de ceer (tyr yy dige $, 
€ 


€ 


where y F(t.) ,i= 0, Sear: 


€ 


Define 
A 
et : (fey) € Se oo} 


as the set of points estimating the location of oin [ a, b ], see Fig. 12. 


The major drawback of this alternative is that an uncertainty set con- 
taining the locations of global maxima in [ a, b ] cannot be precisely determined, 


x ¢ T implies that 
a) = Es 


however, the true locations may still be outside of T. Figure 12 illustrates why 

this method may not be successful in estimating the locations of all global maxima. 

For this particular example, ( tg ¥g ) £ T because ¥3 < 2 . Even though te 

is close to an abscissa for which ¢ is attained it is not considered in this case. The 
abscissa t. is less close to an abscissa for which cis attained than is to and yet this 

value is considered an estimate by this method. 

Despite the fact that the method fails in this respect, it may still be used, 
with some reservations, if more than one location estimate is desired. The program 
listing for this alternative is located in Appendix E. 

3. Alternative III 
This alternative results in the genuine uncertainty set of the smallest pos- 
sible size. From the considerations at the end of Chapter Il, it follows that this set 


is given by 
_ 


€ 


oF 


Ve—miexere [a,o): Ff (x) = 
€ 
From the piecewise linear character of the function F it follows further that the set 
€ 
V is a union of a finite number of disjoint closed intervals 


aan le <. \ 
I. r. 


The following computations are needed to obtain V from the algorithm: Rearrange all 
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Figure 12, Alternative II. 
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(t. ,Z. ye DB _ in nondecreasing order of first components. Then compute: 


unless x, as computed above is less than the left most endpoint of the interval 


"1 


[ a, b ] at which time x becomes a. That is, 


| 


x = mox {a,t,-1 (z,- 5) }. 
- ; 
| A 
x = m eee on 
r. In { Ge ( e)3 5 
| A 
x, = io Seca sO) 3} 
i+ 1 ° 
The iterations continue until 
Kae x 
i. | 
ete 
at which time x becomes the endpoint of the first interval and x, the left- 
i 
most endpoint of the second interval. This procedure continues until i = H at 
€ 


which time the rightmost endpoint of the last interval is computed to be 


ee Xe The union of all these intervals belong to V. Clearly, yCV. See 
i H 


€ 
Fig. 13 for an illustration of this approach. 

Although this alternative provides the genuine uncertainty set of loca- 
tions for «, the number of intervals in the set becomes unwieldy and too clumsy to 
be practical. However, one can be assured that «9 is contained in at least one of 
these intervals. The flow chart and program listing for this alternative are located 
in Appendices C and F respectively. 

4, Alternative IV 


This alternative is an attempt to reduce the number of intervals in V, for 
clarity, at the expense of enlarging the uncertainty set containing » . The nature 
of the sampling rule is such that when the algorithm stops, the first components of 


(t., Z. ) ¢ D_ cluster about y defined by (14). By rearranging the vectors 
€ 
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a=t) ee | ty : 


Figure 13. Sketch of Alternative III. 
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(+., Zz.) ¢ D_ in nondecreasing order of first component, the breakpoints be- 
i I € 


tween clusters are usually well defined as relatively large gaps between some t., 
I 
fe i” in the rearranged data set. Utilizing the endpoints of these clusters and the 
j 
function F , it is a very simple task to consolidate the points in each cluster toa 


single raya of uncertainty. The set of all such intervals formed in this manner 
is the set of intervals under consideration. Clearly, V is a subset of the union of 
these intervals, since the new set will include those portions of [ a, b ] between the 
intervals in V which do not contain o. Hence, the new uncertainty set is larger 
than V but is more practical for presenting experimental results. 

Let k be the number of clusters in the rearranged data set and let 


H = { [u PeUme let. 7 | U , wo iy 
' ry iV ’ 


be the set of intervals such that uy is the leftmost endpoint in the i th cluster and 
i 
u. is the rightmost endpoint of the i th cluster in H, i= 1, ..., k ; see Fig. 14. 
i 
Clearly, [ UU | e« H does not define the entire interval of uncertainty under 
i 1 

consideration for the i th cluster. There is a portion of [ a, b | to the left of u | 

i 
and a portion of [ a, b ] to the right of u. which must be included in the uncer- 

i 

tainty set under consideration. This is obvious because 


A 
ae et 
€ € 


which implies that «~ could very well be contained in the intervals [ wir Uy | 


and [ uw ], see Fig. 14, where 


wy re (yo eS) 


The set of uncertainty intervals defined by 


A8 





Figure 14. Sketch of uncertainty set W-. 


WAARAAK 
MURS e Cee ae 








i-th cluster (i + 1) st cluster 


intervals forming the set V (Alt. Ill. ) 


eZ 
A added to V to obtain W (Alf. IV. ) 
O 


points forming the set T (Alt. Il. ). 
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W= f per a a ee 


is the set desired for this alternative. The flow chart and program listing for this 


alternative have purposely been deleted from the Appendices since the intervals 


can be rather easily obtained from results of Alternative III. 
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IV. THE COMPUTERIZED ALGORITHM AS A MEANS FOR 
LOCATING ZEROS OF A FUNCTION 
The algorithm resulting from this research can also be used to locate the zeros of 
a function. Only a slight modification to the algorithm is necessary to obtain these 
results. 


By setting 
g(x) =- | Ff (x) | , 


the transformation is such that all global maxima « of g in [ a, b ] occur at the zeros 
of f(x). This fact allows the computerized algorithm to conform nicely to the task 
at hand, If the function f has at least one zero, ihe global maximum of g is » = 0, 


therefore the new data set after each iteration will be 


7 roe 
Dae ( (t,2,), eee (ty a ty a 1) } 


The iterations continue as before until 


*H ees) 6° OF ae 
n n 
since ¢ = O, at which time the iterations stop. The zeros of f (x ) are estimated 
by applying the procedures discussed in Alternatives | - IV of Chapter III. The alter- 


native used will depend on the form and the accuracy desired by the experimenter. 
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V. SAMPLE PROBLEMS 


The results of the following sample problems were used as criteria for (1) test- 
ing the algorithm; and (2) comparison with results of the same problems obtained by 


other search procedures. 


A. SAMPLE PROBLEM | 


Given the deterministic function 
fG@xe\e— Oat xX = X 


estimate the global maximum « and its location in the experimental region [ 0, 2 ] 
with a desired accuracy of ¢ = .0O1. 
1. Discussion 

This problem was used throughout the design stages of the computerized 
algorithm due to its relative simplicity. It was primarily used in the design stages 
because the global maximum and its location could easily be determined by differ- 
ential calculus. 

Clearly, the function is unimodal in [0, 2], see Fig. 15, and by the 


calculus, 


Nelo 
eo 


Since the function is differentiable, the Lipschitzian constant C can be computed 


- 








as 
C= max c 
eras dfit_ max Giie2xe lee 9 
° s dx X € [ oF 2 ] 
2. Results 
For the desired accuracy of ¢ = .CI, the algorithm required 63 
iterations (samples) giving the estimate > = 3.25. It was determined a priori 
- =e 


from Fig. 15 that the number of global maxima was 1, hence, Alternative | was 

used to estimate the location of « . The algorithm gave x = .5 for this estimate. 

The largest number of vectors, (t., z. ), stored in the computer's memory was 40 
Lo i 


and the computer time necessary (including the time required to compute f (x. ) ) 


was 4.36 seconds. 
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f(x) 
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Figure 15. Graph of f (x)= ee 
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B. SAMPLE PROBLEM Il 


Given the function 


im sin{-(it+1)NX + i], 


i=] 
estimate the global maximum ¢ and its location in {[ .01, 10] with a desired 
accuracy of - = .O1. 


1. Discussion 


The results of Sample Problem | seem to indicate that the computerized 
algorithm under consideration is quite successful in maximizing a unimodal function. 
It remains to be seen how the algorithm reacts to functions which are multimodal in 
nature. The problem described above was used to test the algorithm's ability to 
maximize multimodal functions. 

The plotting package of an IBM 360/57 computer was used to plot f in 
[.01, 10 ], see Fig. 16. It is obvious from Fig. 16 that f is multimodal in [.01, 10] 
and that the global maximum occurs in the interval [.01, 1]. Although the func- 
tion f is differentiable, it is difficult to obtain the exact global maximum and its 
location using the techniques of differential calculus. Hence, gradient techniques 


were applied in [ .01, 1.0] to obtain 


ie eosin 
y = { 0.2416...) . 


at 


The fact that f is differentiable allows C to be computed as 
-~ 1/2 
C = max 2 -(i+l1)x = 350. 
Xe [ Ore 10 | oa 
= 


For sake of illustration, the sequence of sampling points x VS. the iteration 
number n were plotted, see Fig. 17. It shows how the search soon abandons 
regions of [ a, b ] where f (x ) is low and concentrates on search in the vicinity 
of maxima. The number of vectors stored vs. the number of iterations required 
were also plotted to illustrate the effects of the data reduction technique utilized 


by the proposed algorithm, see Fig. 18. 
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Figure 18. Number of vectors stored vs. number of iterations. 
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2. Results 
For the desired accuracy of ¢ = .01, the algorithm required 890 iterations 
giving the estimate © = 12.0312. Since the number of global maxima was deter- 
a priori from Fig. 16 S be 1, Alternative 1 was again used to estimate the location 
of» . x was estimated as 0.2415 . The largest number of vectors, (t., Z. ez 
stored in the computer's memory was 418 and the total computer time (including 


the time to compute f ( x )) was 23.96 seconds. 


C. SAMPLE PROBLEM II! 


Given the function 


2D 


f(x) =) ) isin{(i+]) x + 7], 


i=] 
estimate the global maximum wand its location in [ -10, 10] with a desired accura- 
cyofe = .Ol. 
IT. Discussion 

So far the method under consideration appears to be successful in maxi- 
mizing functions with a single global maximum. The function under consideration 
was used to determine the algorithm's ability to maximize functions with more than 
one global maximum. 

A computer plot of f was used to determine the nature of the function, 
see Fig. 19. It appears from Fig. 19 that f has more than one global maximum in 
the experimental region. The function is differentiable, but too difficult to locate 
the maximum using calculus. Gradient techniques were again applied in the inter- 


vals assumed to contain the maximum. These methods located the maximum 


ZO SNS nies 


I 


p 
MEC774A5 6 5791S ee} 


C was computed as 


at y 


C = max PlmiG@r ty} = 70. 
xe [-10, eI 
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Figure 19. Graph of f(x )=<2', isin [(i +1) xti]. 
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2. Results 
For the desired accuracy of ¢= .01, the algorithm required 444 iterations 
Svs ° A ; : : 
giving the estimate © = 12.0313. Examination of Fig. 19 revealed the possibility 
of more than one global maximum in [ -10, 10], hence, Alternative IV was used 
to locate ¢ in this interval. The residual uncertainty in the location was reduced 


to three intervals 


W = { [ -6.7907, -6.7595 ], [ -0.5129, -0.4261 ], 
[ 5.7749, 5.8061] } 


(there is a local maximum f (x ) = 12. 0312 ... at x = -0.4914 ....) The 
largest number of vectors, (t., Z. ) , stored in the computer's memory was less than 


250 and the total computer time was 13 seconds. 


D. SAMPLE PROBLEM IV 


Given the function 
5 : 
F(x)= > 7, isin [-(it1l)x+ti], 
i= | 
estimate the zeros of fin [ .01, 10] with a desired accuracy of e = .01. 
1. Discussion 
This problem was used to test the algorithm's ability to locate zeros of a 


function. The function 
g(x) = -|[F(x) | 


was plotted by the computer in an effort to determine the intervals in [ .01, 10 ] 
where 

max g(x)=0 

xe [.01, 10] 
(zero will be the global maximum of g as described in Chapter 1V.) The location 
(s) of the global maxima ~= 0 are the zeros of f. Figure 20 was used to approxi- 
mate starting points for gradient methods in order to locate the true zeros of f. The 
zeros were determined as (0.0215 ..., 0.0000), ( 0.6180 ..., 0.0000 ), 
er o775m 80,0000), (4.2232 ..., 0.0000) , { 6.3051 ..., 0.0000), 
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i= | 
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(9.0865 ..., 9.0000) , C was computed as 
| 


C= max 


s 2 
xe 01, 101 (]P 4 (i +1) » pee 350, 


2. Results 
For the desired accuracy of ¢ = .01, the algorithm required 2253 itera- 
tions. Alternative IV was used to estimate the locations of the zeros. The residual 


uncertainty in the location was reduced to six intervals 


[ 0.0214, 0.0239] , [0.6173, 0.6186 ], 
ieZe07 75273 |), | 4.2280 , 4.4323 | 
[ 6.2246, 7.2566] , [ 8.8457 , 9.1452] . 


The largest number of vectors ( ty Z. ) stored was 474 and the total computer time 


was 52.87 seconds. 
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VI. EXPERIMENT WITH THE COMPUTERIZED ALGORITHM 


A. PROCEDURE 
The following experiment was performed to test the algorithm's sensitivity to 
the Lipschitzian constant and the shape of the criterion function. To obtain data 


for the experiment, eighteen functions of the form 


max { 10 [ 1] - —) 
a, Bry (x) ( Bp _** 


mox (of 1-(2=2\" 1} if -l0<x < 0 
ry 


were used with the following ranges on the parameters: 


Oc < 10, 
Oe ees 5) , 
O<y< a 


Figure 21 illustrates the general form of the function f y and the eighteen 
aA, 8 


e e e . Z 
functions with their exact parameters are illustrated in Figie22. 
C was computed as 


C= max 


xe [-10, 10] d fa av 


d x 


The Lipschitzian constant C was allowed to vary from its minimum value defined by 
(17) to five times its minimum value for each of the eighteen functions. The com- 
puterized algorithm was used to maximize each variation of a ey and the 
corresponding results; namely, the number of iterations required and the largest 


number of vectors in D_ stored, were recorded in Table !!. All results are based 
n 


on a desired accuracy of ¢ = .Q1. 
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B. EXPERIMENTAL RESULTS 
A brief description of Table !I follows: 


‘alin oy es 100 (10e (18) 
xe [-10, 10] 7 ag 
: Y 
defined by (17). i oa , t= 1, 2, 3, 4, 5 were the variations of C used by the 
algorithm to estimate the global maximum « for the functions 1 - 18 specified by 
the shape parameters 7, 2, y . Figure 22 illustrates the eighteen functions under 
consideration. The data points are the number of iterations required and the larg- 
est number of vectors in Dd. for the corresponding i C in and function. 
The columns of Table Il and the graphs of the tabulated results, Fig. 23 and 
Fig. 24, seem to confirm the theoretical conjecture that the number of iterations 
and the number of vectors stored increase approximately in direct proportion to 
min ° 
The effects of the shape of the criterion function are not so readily apparent from 
Table Il and Figs. 23 and 24. In order to obtain these results it is necessary to de- 
termine the value of the data points for each function with different shape para- 
meters evaluated for the same constant C. From Table II it is obvious that the 


common C would have to be 100 since 
max C ,. = 100. 
min 


Unfortunately time was not available to compute the data points for each function 


using a common C. In an effort to use the available data to estimate the algorithm's 


sensitivity to function shape, it was hypothesized that fori = 1] 
mex (ud 
mines tL, 
Cry 
min 


where | is the number of iterations required when C = C in’ would approximately 

equal the number of iterations required if C_, = max C_, . Similarly, for i =1, 
min min 

it was hypothesized that 


max C.. 
m 


eS 
e 


min 
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where S is the largest number of vectors in D when C=C | , would approximately 
n min 


equal the largest number of vectors in D required if C = maxC .. This hypo- 
n min 


min 
thesis seems plausible since both iterations required and amount of vector storage 


required is approximately in direct proportion to 


max C ., 
min 
G 


min 

The results based on this hypothesis are tabulated in Table Il. Even from this 
table it is not readily apparent how the shape of the function affects the algorithm, 
due to the interaction of the parameters. However, the data points obtained for 
each function can be compared to their corresponding graphs in Fig. 22 to analyze 
these effects. 

The data points of Table II! have been rearranged in nonincreasing order and 
tabulated in Table IV. Table IV makes the comparisons between data points and 
corresponding figures a little more systematic. From these comparisons it becomes 
obvious that the largest number of iterations required and the largest number of 
vectors required to be stored occur when the function is relatively flat in the area 
of the global maximum « . Furthermore, one can see by a similar comparison that 
as other local maxima less than or equal to the global maximum ¢@ get closer to op 


the iterations and vectors in D_ increase accordingly. 
n 


C. CONCLUSIONS 

It was concluded from experimental! results that the algorithm is most sensitive 
to the value of the Lipschitzian constant C. If the value of C selected by the ex- 
perimenter (call this value C.) is greater than Cam defined by (18) then the search 


for the global maximum ¢p will require approximately s \times as many itera- 


min 
tions and vector storage required than would be required if C = C in . Further- 
more, as the number of iterations increase the amount of computer time necessary to 
make the computations increase proportionately. Similarly, as the number of vectors 


in D_ increase the amount of computer core memory necessary to store the vectors 
n 


increases proportionately. Hence, since computer cost is a function of time and 
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Table IV. Rearrangement of iterations/vectors for ease of comparison. 
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core required, the cost of the experiment will increase proportional to C . Clearly, 
S 


experimental costs will be optimized when C = C . . 
S min 


If the value for C cannot be obtained analytically and the nature of the func- 
tion is unknown, the experimenter must be cautious in his selection of C . If 
S 
C. < C in it is possible that the algorithm will fail completely. On the other hand 


the algorithm will always work if C. S C in’ but at a higher cost. Therefore, if 
the exact value for C in cannot be determined a priori, it would be better for the 


experimenter to risk an over estimate of C at a greater cost than to underestimate 
and accomplish nothing. 

It can also be concluded from the experimental results that the algorithm is 
sensitive to the function's shape, though not as sensitive as it is to the Lipschitzian 
constant. In regards to shape, the algorithm is most sensitive to how flat the 
function is in the area of the global maximum. Of secondary importance, but still 
significant is the height of all local maxima, in the experimental region, less than 
the value of ¢ . As the heights of these local maxima approach » , the computa- 
tion cost increases accordingly. Furthermore, though not established by experimental 
results, it seems plausible that as the number of local maxima relatively close to 
increase the cost will increase accordingly. 

Knowledge of the shape of the function to be maximized may be useful for 


making gross cost estimates or as a programming aide. 
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Vil. RESULTS 


A. RESULTS OF SAMPLE PROBLEMS 
Consider Sample Problem III of Chapter V. The grid search would require 
ae 
8, 


p 
N 


sampling points for the same accuracy of ¢ = .01, where d =| (-10) - (10) | = 20 


and 
by = Ze _ 02 = 000 285... 
a7 


Thus, the number of sampling points required by the grid search is approximately 


ee 


while the number of sampling points required by the sequential method proposed by 
this research was less than 250. Similar comparisons can be made for the other 
sample problems and the results would be relatively the same, namely, the number 
of sampling points required would be considerably higher for grid search than for 
the sequential method proposed by this research. 

Purely random methods would require about twice as many sampling points as 
the grid search for the same accuracy and for a sufficient confidence level. 

This indicates that for nontrivial functions, the convergence rate of the algorithm 


tends to be much faster than the slowest theoretical rate C (b-a) , 
n 


B. CONSEQUENCES OF MEASUREMENTS ERRORS OR SMALLER C THAN 

REQUIRED 

Figure 25 illustrates the consequence of a measurement error, in this case, 
when the measured value of the function is less than the actual value in the region 
containing the maximum. Figures 26 and 27 illustrate the consequences of a smal- 
ler C than required by the algorithm. 

In both cases F (x ) may fail to be an upperbound to f ( x ) in some regions of 
the domain [a,b 7 Hence, some regions may be excluded from future search. If 


these regions happen to contain points where the global maximum is attained, the 
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algorithm will result in error, see Figs, 25 and 26. In both of the cases illustrated, 
(Figs. 25, 26), the value of « will be underestimated and the location will be 
wrong. On the other hand it is possible for the algorithm to be successful even if 
C is less than required, see Fig. 27. If the portion of the function that requires C 
to be large lies considerably below the global maximum it is still possible to obtain 
desired results from the algorithm. 


The errors may be detected in some cases by obtaining 


at some point in the sampling sequence, however, this need not necessarily happen, 


see Fig. 26. 
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VIIT. CONCLUSIONS 


The results from the comparisons made between the sequential method proposed 
by this research and the nonsequential methods of grid and random search clearly indi- 
cate that the proposed sequential method is preferable to the nonsequential methods, 
especially in cases where the function to be maximized is difficult or time consuming 
to evaluate or when the sampling of a function is costly. 

The primary advantage the proposed sequential maximization method has over 
other methods that are sequential, such as gradient and min-max methods, is that 
the proposed method is not restricted by the modality of the function, nor does it 
rely on the regulairty conditions which must be satisfied by these other sequential 


methods. 





IX. RECOMMENDATIONS FOR FUTURE RESEARCH 


The sequential search procedure proposed by this research has proved success- 
ful in estimating the global maximum and its location of a function whose exact form 
in the experimental region was known. It is obvious that the same method could be 
applied to functions whose exact form is unknown. It is the recommendation of the 
author that future research be conducted in this area by investigating practical 
examples of the following type: 

Consider the "black box" case, where the output of the box is a function of a 
single independent input variable ranging over the experimental region. The 
selection of the Lipschitzian constant may be difficult in this situation but may be 
obtained, by the experimenter's previous knowledge of the system under investiga- 
tion or from experts who are familiar with the system. By data linking the input/ 
output signals of the box to the on line computerized algorithm the same results can 
be obtained. See Fig. 28 for an illustration of this approach. 

The algorithm proposed by this research could also be experimentally investiga- 
ted in light of maximizing discrete functions, which may lead to several interesting 
practical problems. 

In principle, it appears that the method could be extended to functions of 


several variables, but at present it does not seem to be computationally feasible. 
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APPENDIX B 


FLOW CHART FOR SUBPROGRAM ESTIMATING LOCATION 
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APPENDIX C 


FLOW CHART FOR SUBPROGRAM ESTIMATING LOCATION 
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